Search CORE

129 research outputs found

Mean Field Bayes Backpropagation: scalable training of multilayer neural networks with binary weights

Author: Meir Ron
Soudry Daniel
Publication venue
Publication date: 24/10/2013
Field of study

Significant success has been reported recently using deep neural networks for classification. Such large networks can be computationally intensive, even after training is over. Implementing these trained networks in hardware chips with a limited precision of synaptic weights may improve their speed and energy efficiency by several orders of magnitude, thus enabling their integration into small and low-power electronic devices. With this motivation, we develop a computationally efficient learning algorithm for multilayer neural networks with binary weights, assuming all the hidden neurons have a fan-out of one. This algorithm, derived within a Bayesian probabilistic online setting, is shown to work well for both synthetic and real-world problems, performing comparably to algorithms with real-valued weights, while retaining computational tractability

arXiv.org e-Print Archive

CiteSeerX

Conductance-Based Neuron Models and the Slow Dynamics of Excitability

Author: Meir Ron
Soudry Daniel
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2012
Field of study

In recent experiments, synaptically isolated neurons from rat cortical culture, were stimulated with periodic extracellular fixed-amplitude current pulses for extended durations of days. The neuron’s response depended on its own history, as well as on the history of the input, and was classified into several modes. Interestingly, in one of the modes the neuron behaved intermittently, exhibiting irregular firing patterns changing in a complex and variable manner over the entire range of experimental timescales, from seconds to days. With the aim of developing a minimal biophysical explanation for these results, we propose a general scheme, that, given a few assumptions (mainly, a timescale separation in kinetics) closely describes the response of deterministic conductance-based neuron models under pulse stimulation, using a discrete time piecewise linear mapping, which is amenable to detailed mathematical analysis. Using this method we reproduce the basic modes exhibited by the neuron experimentally, as well as the mean response in each mode. Specifically, we derive precise closed-form input-output expressions for the transient timescale and firing rates, which are expressed in terms of experimentally measurable variables, and conform with the experimental results. However, the mathematical analysis shows that the resulting firing patterns in these deterministic models are always regular and repeatable (i.e., no chaos), in contrast to the irregular and variable behavior displayed by the neuron in certain regimes. This fact, and the sensitive near-threshold dynamics of the model, indicate that intrinsic ion channel noise has a significant impact on the neuronal response, and may help reproduce the experimentally observed variability, as we also demonstrate numerically. In a companion paper, we extend our analysis to stochastic conductance-based models, and show how these can be used to reproduce the details of the observed irregular and variable neuronal response

Crossref

Directory of Open Access Journals

PubMed Central

Frontiers - Publisher Connector

The Implicit Bias of Gradient Descent on Separable Data

Author: Gunasekar Suriya
Hoffer Elad
Nacson Mor Shpigel
Soudry Daniel
Srebro Nathan
Publication venue
Publication date: 28/12/2018
Field of study

We examine gradient descent on unregularized logistic regression problems, with homogeneous linear predictors on linearly separable datasets. We show the predictor converges to the direction of the max-margin (hard margin SVM) solution. The result also generalizes to other monotone decreasing loss functions with an infimum at infinity, to multi-class problems, and to training a weight layer in a deep network in a certain restricted setting. Furthermore, we show this convergence is very slow, and only logarithmic in the convergence of the loss itself. This can help explain the benefit of continuing to optimize the logistic or cross-entropy loss even after the training error is zero and the training loss is extremely small, and, as we show, even if the validation loss increases. Our methodology can also aid in understanding implicit regularization n more complex models and with other optimization methods.Comment: Final JMLR version, with improved discussions over v3. Main improvements in journal version over conference version (v2 appeared in ICLR): We proved the measure zero case for main theorem (with implications for the rates), and the multi-class cas

arXiv.org e-Print Archive